ABSTRACT. Extracting market expectations has always been an important issue when making national 
policies and investment decisions in financial markets. In option markets, the most popular way has 
been to extract implied volatilities to assess the future variability of the underlying with the use of the 
Black & Scholes formula. In this manuscript, we propose a novel way to extract the whole time varying 
distribution of the market implied asset price from option prices. We use a Bayesian nonparametric 
method that makes use of the Sethuraman representation for Dirichlet processes to take into account the 
evolution of distributions in time. As an illustration, we present the analysis of options on the S&P500 
index. 



1. Introduction 



Derivatives have a significant influence on the behavior of spot markets. Understanding the
expectations of actors in derivative markets can provide helpful insights on general economic conditions
and the future behavior of the corresponding spot prices (French, 1986; Fama and French, 1987;
Tomek, 1997). In futures markets, it is common to use the observed prices of the contracts together
with a no-arbitrage condition to estimate the implied prices of the underlying asset in the spot market.
In this case, the implied price corresponds to the net present value of a futures contract. This implied
price, which is the fair price at inception of the derivative contract, can often be different from the
price actually observed in the spot market.

Another useful example is structural credit risk. Extracting the unobserved implied asset price is
a useful and necessary task in this field, whose purpose is assessing a corporation's credit risk. The
central distinguishing point of structural credit models is the view of debt, equity, and other claims
issued by a firm as option derivatives on the firm's asset value (Black and Scholes, 1973; Merton,
1973, 1974). Given that the asset value is unobserved, one has to estimate it in order to compute many
important financial ratios that assess the financial risk of a corporation, such as the probability of default,
the corporation's bond spread, and its distance-to-default. The distance-to-default, defined as
the distance (measured in standard deviations) of a firm's asset value from its default threshold, has
become popular as a measure of a firm's creditworthiness (Vassalou and Xing, 2004). A popular
implementation of Merton's structural model (Merton, 1974) is the commercial KMV model, which,
through the use of the Black and Scholes pricing formula, is able to solve for the value of the asset
given the equity of the firm, which is treated as a call option. However, it depends critically on the
assumptions underlying the Black and Scholes model, in particular, that prices follow a geometric Brownian
motion.

Outside of the previous context, extracting the underlying/asset value is rarely done in financial op- 
tion markets; instead, the implied volatility of returns from the Black and Scholes model is typically 
computed. Although this quantity is useful for understanding option prices, it also depends crucially 
on the lognormality assumption of returns and provides no real information about under/over valua- 
tion of assets in the spot market. 

In principle, no-arbitrage conditions together with interest rates, call and put prices can be used to
determine option-implied prices for the underlying asset in a similar fashion as for futures and
forwards. However, one of the difficulties of using these prices is that, unlike in futures markets, multiple
observations are available for any given expiration date (each corresponding to a different strike price),
each leading to a slightly different implied price. This means that we need to deal with a distribution 
of implied prices, which can be highly non-Gaussian, presenting heavy tails and multimodality. The 
lack of normality implies that simple models based on summaries of the distribution (like the first 
two or three empirical moments) are inappropriate and can be misleading. Another difficulty is that 
a variable number of transactions takes place each day, as some strikes might not be traded. There- 
fore, independent density estimates for each day can be highly unstable, especially in periods of low 
liquidity when few transactions take place. 

In this paper, we propose a Bayesian dynamic nonparametric model to estimate the collection of
distributions of option-implied prices. This has numerous advantages over models based on empirical
moments, as it is able to explicitly capture features like multimodality and allows us to estimate
probabilities of default events that are unavailable under simpler models. Our model is based on the
Dirichlet process (Ferguson, 1973, 1974), and is an extension of the dynamic Dirichlet process models
in Rodriguez and ter Horst (2008) that incorporates stochastic volatility. It uses an infinite mixture of
time-evolving distributions, leading to a flexible and efficient model that borrows information across
time periods to improve estimation and prediction. As a motivating example, we analyze European
option and spot data for the S&P500 between January 1993 and March 1994. However, our argument
can be immediately extended to other markets and other types of options.

Some possible alternatives to our model that have been extensively discussed in the literature
include GARCH and stochastic volatility models (Hull and White, 1987; Heston and Nandi, 2000;
Nicolato and Venardos, 2003) as well as Markov switching (Campolieti and Makarov, 2005) and
mixture models (Schittenkopf et al., 1998). However, none of these models are appropriate for the
problem of estimating distributions of implied prices. In particular, these models assume that
predictive distributions are conditionally normal, typically use a fixed number of mixture components for
the mixtures and/or move all observations at a given time point simultaneously across components.
These features seriously restrict the shape of the density estimates generated by the model, ruling out
multimodal and skewed distributions. In contrast, our model provides support over a large class of
continuous distributions, which allows it to capture the peculiar features of implied-price distributions.
Also, infinite mixtures allow us to automatically deal with the number of components in the mixture as
a nuisance parameter, eliminating the need to select the number of components. Instead, estimates are
obtained by averaging over different numbers of components according to their posterior probabilities.

We would like to note that inferring the distribution of implied prices discussed in this paper is
different from estimating the risk-neutral distribution (Ait-Sahalia, 1996; Ait-Sahalia and Duarte, 2003;
Panigirtzoglou and Skiadopoulos, 2004; Soderlind and Svensson, 1997), as the latter refers to the
price at expiration while the former refers to the current price. Indeed, to the best of our knowledge,
there is little precedent in the literature for the implied price distributions we are advocating in this
paper.

2. Implied-price distributions in option markets

A European call option is an instrument that gives the buyer the right, but not the obligation, to buy 
an underlying asset at a fixed price X (called the strike price) at a future time T (called the expiration). 
Similarly, a European put option gives the buyer the right to sell an underlying asset at a fixed price 
X at the expiration time T. For risk management and hedging purposes, options are usually traded in 
pairs. 

Let $C_t(X)$ and $P_t(X)$ be, respectively, the price of a call and put option with strike value $X$ at time
$t < T$. A well-known no-arbitrage condition (Hull, 2005) states that

(1) $C_t(X) - P_t(X) = S_t - X \exp(-r_t (T - t))$

where $S_t$ is the current spot price of the underlying asset and $r_t$ is the interest rate available to market
actors at time $t < T$. Given the prices $C_t(X)$, $P_t(X)$ we can solve for $S_t$ to obtain the option-implied
price for the underlying asset,

(2) $S_t(X) = C_t(X) - P_t(X) + X \exp(-r_t (T - t))$
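
As a small worked illustration of equation (2), the sketch below computes one implied price per strike for a single day. The quotes, the interest rate and the function name `implied_spot` are hypothetical and chosen only for illustration; they are not taken from the data set analyzed in this paper.

```python
import math

def implied_spot(call, put, strike, rate, tau):
    """Option-implied spot price from put-call parity, equation (2).

    tau is the time to expiration T - t in years; rate is r_t.
    """
    return call - put + strike * math.exp(-rate * tau)

# Hypothetical (call price, put price, strike) quotes for one trading day.
quotes = [(12.0, 4.5, 440.0), (8.0, 6.0, 445.0), (5.5, 9.0, 450.0)]

# One implied price per strike: this is the sample whose distribution is modeled.
implied = [implied_spot(c, p, x, rate=0.03, tau=0.25) for c, p, x in quotes]
```

Each strike yields a slightly different implied price, which is why a whole distribution, rather than a single number, has to be handled on every day.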

As an illustration, consider the implied prices of the S&P500 between January 4th 1993 and March
17th 1994 depicted in Figure 1. We concentrate on options with three-month maturity and use the
LIBOR as the interest rate for all our calculations (Panigirtzoglou and Skiadopoulos, 2004). The data
set was constructed by Yacine Ait-Sahalia and has been used in other empirical studies (Duffie et al.,
2000). The subset we employ contains a total of n = 4385 trades spread over T = 306 days, with
sample sizes in any specific day varying between 0 and 26. Note that the distribution of prices on any 
specific day may be highly skewed and may have very heavy tails. Simple means (corresponding to
the continuous line) vary wildly, especially during the summer of 1993 when fewer trades occur and
extreme values are highly influential.

Option-implied prices obtained from (2) may dramatically differ from the spot price prevalent at
the time. Indeed, the option-implied prices reveal the value that the actors in the derivatives market
assign to the underlying asset based on their private information, which might be different from the
information available to other investors. Figure 2 shows the differences between implied and spot
prices in our sample, $\Delta_t = S^I_t - S_t$, where $S^I_t$ denotes the option-implied price. Just as
before, the distributions of expectations are left skewed and possibly multimodal, but no trend is
apparent in this case. Implied prices tend to be about 7 points lower than the prevalent market price,
which represents about a 1.5% difference, something reasonable when taking into account the
transaction costs involved to synthetically replicate the underlying by taking long or short positions in
call and put options as well as a zero coupon bond. However, in some cases the estimated differences
can be as large as 15% of the spot price.

FIGURE 1. Implied prices in the S&P500 between January 4th 1993 and March 17th
1994. Multiple observations for any given day correspond to the different strike prices.
Raw data is represented with dots, while the continuous line shows the evolution of
daily averages in time.

FIGURE 2. Differences between spot and implied prices between January 4th 1993
and March 17th 1994. The continuous line corresponds to daily means.

Figure 2 also helps us illustrate the fact that the distribution of implied prices is not the risk-neutral
distribution, as it corresponds to the current price $S_t$ and not the price at expiration, $S_T$. Indeed, note
that the price of the option under the risk-neutral density has to converge (in probability) to the market
price as we approach expiration. This property is clearly not satisfied by the prices shown.

Some additional information on these distributions is provided in Figure 3, which shows four
consecutive kernel density estimates for the price differences $\Delta_t$ between February 18 and February 25, 1993
(no observations are available for February 19 and 22). Bandwidths were estimated independently at
each time using cross-validation (Silverman, 1986). Distributions can change dramatically but are
generally multimodal and skewed. Bandwidths also change dramatically. From this, it is clear that
standard parametric models are not a viable alternative in this setting, making non- or semi-parametric
methods a necessity.

3. Bayesian nonparametric modeling for collections of time-evolving distributions

In this Section, we discuss the statistical model for dynamic density estimation in the context of 
implied-price distributions. We start by reviewing the Dirichlet process and some of its extensions, 
and then move to discuss our model and its computational implementation. 

3.1. The Dirichlet process. Let $(\mathcal{X}, \mathcal{B})$ be a complete and separable metric space
(typically $\mathcal{X} = \mathbb{R}^n$ and $\mathcal{B}$ are the Borel sets on $\mathbb{R}^n$), and let
$K_0 \in \mathcal{K}$ be its associated probability measure. A Dirichlet process (Ferguson, 1973, 1974)
with baseline measure $K_0$ and precision $\alpha$, denoted $DP(\alpha K_0)$, defines a distribution on the
space of probability measures $\mathcal{K}$, such that $K \sim DP(\alpha K_0)$ if and only if it admits a
representation of the form,

(3) $K(\cdot) = \sum_{l=1}^{\infty} w^*_l \delta_{\eta^*_l}(\cdot)$

where $\{\eta^*_l\}_{l=1}^{\infty}$ are independent and identically distributed samples from $K_0$ and
$w^*_l = z^*_l \prod_{k=1}^{l-1} (1 - z^*_k)$ with $\{z^*_l\}_{l=1}^{\infty}$ iid samples from a
Beta$(1, \alpha)$. Equation (3) is called the stick-breaking representation (Sethuraman, 1994), and it
readily shows that the Dirichlet process places probability one on the subspace of discrete distributions.

FIGURE 3. Kernel density estimates of implied differences in the S&P500 prices between
February 18 and February 25, 1993. The number of observations $N$ and the
bandwidth estimated through cross-validation are shown below each plot.

Another consequence of (3) is that, for any set $B \in \mathcal{B}$, $K(B)$ is a random quantity following
a Beta distribution with parameters $\alpha K_0(B)$ and $\alpha (1 - K_0(B))$. In particular,

$E(K(B)) = K_0(B)$, $V(K(B)) = \frac{K_0(B) (1 - K_0(B))}{\alpha + 1}$

Therefore, $K_0$ and $\alpha$ can be interpreted, respectively, as mean and precision parameters. In order
to make the model more flexible, the DP is typically used to describe the (unknown) distribution of
the parameters of some continuous distribution $H(\cdot | \eta)$, leading to the well-known Dirichlet process
mixture (DPM) models (Lo, 1984; Escobar, 1994):

$y \sim \int H(y | \eta) K(d\eta)$, $K \sim DP(\alpha K_0)$

A common choice is $H(\cdot | \eta) = N(\cdot | \eta = (\mu, \sigma^2))$, yielding a Gaussian location-scale
mixture model that is dense in the space of absolutely continuous distributions (Lo, 1984). This model can
also be interpreted as a Bayesian kernel density estimator: while the sequence $\{\mu^*_l\}_{l=1}^{\infty}$
controls the location of the Gaussian kernels, the set $\{\sigma^{*2}_l\}_{l=1}^{\infty}$ controls the
bandwidth associated with each of them (Escobar, 1994).
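
The stick-breaking construction in (3) and the mixture model built on it can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the infinite sum is truncated at `n_atoms` terms (the last atom absorbs the leftover mass), and the baseline measure used to draw each kernel's location and standard deviation is a hypothetical choice made only so that the example runs.

```python
import random

def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights w_l = z_l * prod_{k<l} (1 - z_k)."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        z = rng.betavariate(1.0, alpha)   # z_l ~ Beta(1, alpha)
        weights.append(z * remaining)
        remaining *= 1.0 - z
    weights.append(remaining)             # truncation: leftover mass on last atom
    return weights

def sample_dpm(alpha, n_atoms, n_obs, rng):
    """Draw observations from a truncated Gaussian DP mixture."""
    weights = stick_breaking_weights(alpha, n_atoms, rng)
    # Hypothetical baseline measure: location ~ N(0, 5^2), sd ~ Uniform(0.5, 2).
    atoms = [(rng.gauss(0.0, 5.0), rng.uniform(0.5, 2.0)) for _ in range(n_atoms)]
    draws = []
    for _ in range(n_obs):
        mu, sigma = rng.choices(atoms, weights=weights, k=1)[0]
        draws.append(rng.gauss(mu, sigma))
    return weights, draws

rng = random.Random(0)
weights, draws = sample_dpm(alpha=1.0, n_atoms=30, n_obs=200, rng=rng)
```

Small values of `alpha` concentrate most of the mass on the first few atoms, while large values spread it over many kernels.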



The Dirichlet process is the most widely used nonparametric model for random distributions in
Bayesian statistics; some recent applications include finance (Kacperczyk et al., 2003), econometrics
(Chib and Hamilton, 2002; Hirano, 2002), epidemiology (Dunson, 2005), genetics (Medvedovic and
Sivaganesan, 2002; Dunson et al., 2003), medicine (Kottas et al., 2002; Bigelow and Dunson, 2005)
and auditing (Laws and O'Hagan, 2002). One of the main reasons for its popularity is the availability
of efficient computational techniques (Neal (2000) provides an excellent review). Simulation
algorithms for the Dirichlet process can be broadly divided in three groups: marginal samplers, which
integrate out the unknown distribution $K$ (Escobar and West, 1995; MacEachern and Muller, 1998);
blocked samplers, which exploit the stick-breaking construction in (3) (Ishwaran and James, 2001;
Ishwaran and Zarepour, 2002; Ishwaran and James, 2002); and Reversible Jump samplers (Jain and
Neal, 2000). In particular we focus our attention on marginal samplers, which exploit the Polya urn
representation of the predictive distribution of the process (Blackwell and MacQueen, 1973),

(4) $\eta_n | \eta_{n-1}, \ldots, \eta_1 \sim \sum_{l<n} \frac{1}{\alpha + n - 1} \delta_{\eta_l} + \frac{\alpha}{\alpha + n - 1} K_0$

Due to the exchangeability of the observations, equation (4) also describes the full conditional
distributions of any parameter given the rest, which are the basic ingredient required to develop Markov
chain Monte Carlo schemes to sample from the posterior distribution of this model (details will be
provided in Section 4).
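
The urn scheme in (4) translates directly into a sequential sampler. The sketch below is illustrative only: `base_draw` stands for a draw from the baseline measure, and the standard normal used in the usage lines is a hypothetical choice.

```python
import random

def polya_urn(n, alpha, base_draw, rng):
    """Draw eta_1, ..., eta_n sequentially from the Polya urn scheme.

    With i values already drawn, a fresh value from the baseline is taken
    with probability alpha / (alpha + i); otherwise one of the previous
    values is copied, each with probability 1 / (alpha + i).
    """
    values = []
    for i in range(n):
        if rng.random() < alpha / (alpha + i):
            values.append(base_draw(rng))      # new atom from the baseline
        else:
            values.append(rng.choice(values))  # repeat an earlier value
    return values

rng = random.Random(1)
sample = polya_urn(100, alpha=2.0, base_draw=lambda r: r.gauss(0.0, 1.0), rng=rng)
n_clusters = len(set(sample))  # number of distinct values, i.e. occupied components
```

The ties among the drawn values are what define the mixture components that a marginal sampler keeps track of.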

3.2. Dependent Dirichlet processes in discrete time. Dirichlet process mixture models are a natural
option if we are interested in a nonparametric model for a single probability measure. However, in
many problems we are interested in how a distribution varies with another variable $s \in \mathcal{S}$; for
example, when estimating expectations from option markets we would like to know how the distribution of
option-implied prices changes as time evolves, as well as to borrow information across consecutive
periods. A lot of recent attention has focused on extending the DP to collections of distributions on
an index space $\mathcal{S}$. One possible strategy is to introduce dependence through linear combinations of
realizations of independent Dirichlet processes; some examples include Muller et al. (2004), Dunson
(2006), Griffin and Steel (2006) and Dunson et al. (2004). Another alternative is to replace the
elements in the stick-breaking representation (3) with sample paths from appropriate stochastic processes
in $\mathcal{S}$ such that

$K_s(\cdot) = \sum_{l=1}^{\infty} w^*_l(s) \delta_{\eta^*_l(s)}(\cdot)$

where $w^*_l(s) = z^*_l(s) \prod_{k=1}^{l-1} (1 - z^*_k(s))$, $\{z^*_l(s)\}_{l=1}^{\infty}$ are independent
and identically distributed sample paths from a stochastic process $\{z(s) : s \in \mathcal{S}\}$ such that
$z(s) \sim$ Beta$(1, \alpha(s))$ for all $s$, and $\{\eta^*_l(s)\}_{l=1}^{\infty}$ are independent and
identically distributed sample paths from another stochastic process $\{\eta(s) : s \in \mathcal{S}\}$. Note
that, for any fixed $s$, the measure $K_s$ follows a regular DP. A construction of this type is called a
dependent Dirichlet process (DDP) (MacEachern, 2000). Constant weight models, where the set of weights
$\{w^*_l(s)\}_{l=1}^{\infty}$ is independent of $s$, are one particularly useful subclass of DDPs as they
allow for straightforward computational implementation that exploits (4). These have been used by
De Iorio et al. (2004) to derive ANOVA models for distributions, Gelfand et al. (2005) to create
spatially varying nonparametric priors and by Rodriguez and ter Horst (2008) to construct dynamic
density estimation models.

In the sequel, let $\Delta_{it}$ be the $i$-th difference between the implied price and the corresponding spot
price observed at day $t$, for $t = 1, \ldots, T$ and $i = 1, \ldots, n_t$. For any fixed $t$, consider
location mixtures of Gaussian distributions,

(5) $\Delta_{it} \sim H_t = \int N(\Delta_{it} | F'_{it} \theta_t, \sigma_t^2) K_t(d\theta_t) = \sum_{l=1}^{\infty} w^*_l N(\Delta_{it} | F'_{it} \theta^*_{lt}, \sigma_t^2)$

where $w^*_l = z^*_l \prod_{k=1}^{l-1} (1 - z^*_k)$, $z^*_l \sim$ Beta$(1, \alpha)$ and $F_{it}$ is a given
vector. For a fixed time $t$, this mixture can be interpreted as a Gaussian kernel density estimate with
common bandwidth $\sigma_t^2$ for all kernels. The relative importance of the kernels is controlled by the
mass parameter $\alpha$; letting $\alpha \to 0$ implies that a single kernel is used and therefore we revert
to a parametric (Gaussian) model. Hence, we see the mixture as a mechanism to approximate an unknown
distribution.

Setting $\{\theta^*_{lt}\}_{l=1}^{\infty}$ and $\sigma_t^2$ independent from $\{\theta^*_{lt'}\}_{l=1}^{\infty}$
and $\sigma_{t'}^2$ for $t' \neq t$ would lead to density estimates that are
independent a posteriori. Since we are interested in borrowing information across time, we introduce
dependence by letting the location of the kernels evolve according to

(6) $\theta^*_{lt} | \theta^*_{l,t-1} \sim N(G_t \theta^*_{l,t-1}, W_t)$, $\theta^*_{l0} \sim N(m_0, C_0)$.
h = s Ct ~ Beta ( 



2 ' 2 



with initial condition $\sigma_0^2 \sim IG(s_0, s_0 S_0)$. As in standard dynamic linear models (Carter and Kohn,
1994; West and Harrison, 1997), the evolution of the atoms follows a multivariate random walk. By
appropriately choosing $F_{it}$, $G_t$ and $W_t$, the model can accommodate trends, periodicities,
autoregressions and dynamic regression models for the location of the kernels. In particular, $W_t$ controls
the magnitude of the change; letting $W_t \to 0$ implies that the location of the kernels is the same at
every time point. On the other hand, the stochastic volatility component follows the first order
autoregressive process developed in Uhlig (1997). The discount factor $\delta$ controls the size of the change
between adjacent days. Setting $\delta = 1$ yields a model with constant bandwidth, while lower values of
$\delta$ yield models with less bandwidth smoothing. Since the level of smoothness in the process depends
so critically on $W_t$ and $\delta$, in the sequel we treat these parameters as unknown and estimate them along
with all other parameters in the model.

This model can also be interpreted as an infinite mixture of linear filters with common evolution
parameters but different state parameters (Rodriguez and ter Horst, 2008); as $\alpha \to 0$ the model
reduces to a regular Kalman filter. Under this interpretation, we can rearrange all terms corresponding
to the $l$-th mixture component to construct $\Theta^*_l = (\theta^*_{l0}, \theta^*_{l1}, \ldots, \theta^*_{lT})'$,
the $l$-th vector of state parameters. Although multiple observations from different time points can be
assigned to the same mixture component, observation $y_{it}$ depends on the corresponding $\Theta^*_l$ only
through $\theta^*_{lt}$. This representation will be exploited in Section 4 to develop a computational
algorithm for the model.

The DDP model described above is rich enough to capture the features of the data highlighted in
Section 2. In particular, it allows for multimodality, stochastic volatility and changing bandwidth
(smoothness) in the density estimates. Also, since the no-arbitrage condition makes no assumption
about the distribution of prices, our approach is more flexible than those based on implied volatilities.

In order to illustrate the flexibility of the model, consider the moments of the time varying distri- 
butions. Conditional on the mixing distribution K t we have, 



E(y it \K t ) = F' it 



Y(y u \K t ) = a? + F' t 



1=1 



1=1 



i=i 



Cov(yit,yi;t+k\K t ) = F' it 



These expressions show that the process is in general nonstationary; in particular, both the mean 
and the variance of the estimated distributions evolve in time. It is also possible to integrate out the 
unknown distribution K t under the Dirichlet process prior, which yields 



E(y it ) = F' u E(O t ) 

V(y«) = T ^F^V(0 t )F it + E(a t 2 ) 
1 + a 



1 



Co\ (y u ,yi>,t+k) = — — F- 
1 + a 



no 

s=l 



t+k-s+1 



V(0 t )Fi 



,t+k 



where $E(\theta_t)$, $V(\theta_t)$ and $E(\sigma_t^2)$ can be obtained from the moments of the baseline measure,
E(0 t ) 
V(0 t ) 



r=l 
t 



+1 



t-r+l 



m 



Co 



t-r+l 



+ 



r=l 



r=l 



i-1 



r=l 



t-r 

lie 

s=l 



Efo 



n 



5n r — 5 



r=l 



t-a+l 



So 



w 



t-r 



s=l 

So n t > 1 



i-s+1 



+ W t 



Therefore, if the evolution process for the atoms is stationary and $F_{it}$ is constant for every $i$ and
$t$, the resulting model for the distributions is a priori centered around a stationary process (even if it
is nonstationary a posteriori). The model described in this section is similar in spirit to that developed in
Rodriguez and ter Horst (2008); the main difference is in the treatment of the variance of the Gaussian
components. Rodriguez and ter Horst (2008) assume that each component has a different
variance whose value is fixed in time. This yields models that allow for the variance of the distribution
to evolve in time only in very restrictive ways. In contrast, the model described above keeps the variance
constant across components but allows it to vary in time, endowing the model with greater flexibility.

For the application discussed in Section 5, we further specialize the model by setting $F_{it} = 1$ and
using a stationary first order autoregressive process for the evolution of the kernel locations,

i=i VP/ 
This is a stochastic volatility, distributional autoregressive model. The model is completed by
establishing priors for the hyperparameters in the model. We give $\rho$ a N(0, 1) distribution truncated to the
interval (-1, 1) in order to ensure that the model is centered around a stationary process. The
evolution variance $U$ is assigned a conditionally conjugate inverse-gamma prior, $U \sim IG(a_U/2, b_U/2)$. The
discount factor $\delta$ is assigned a uniform prior on the discrete set {0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.95}
for computational simplicity. Finally, the global mean $\mu$ is given a $N(\mu_0, \tau^2)$ prior.
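
To make the structure of this specialized model concrete, the following sketch simulates from a truncated version of its prior. It is written under explicit assumptions that are not taken from the text: the autoregression is parameterized as $\theta^*_{lt} = \mu + \rho(\theta^*_{l,t-1} - \mu) + \epsilon_{lt}$ with $\epsilon_{lt} \sim N(0, U)$, each path starts from the stationary distribution $N(\mu, U/(1-\rho^2))$, the stick-breaking sum is truncated at `n_atoms` terms, and all numerical values are hypothetical.

```python
import math
import random

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights, constant over time."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        z = rng.betavariate(1.0, alpha)
        weights.append(z * remaining)
        remaining *= 1.0 - z
    weights.append(remaining)
    return weights

def ar1_atom_paths(n_atoms, T, mu, rho, U, rng):
    """One stationary AR(1) path (theta_0, ..., theta_T) per mixture component."""
    paths = []
    for _ in range(n_atoms):
        theta = rng.gauss(mu, math.sqrt(U / (1.0 - rho ** 2)))
        path = [theta]
        for _ in range(T):
            theta = mu + rho * (theta - mu) + rng.gauss(0.0, math.sqrt(U))
            path.append(theta)
        paths.append(path)
    return paths

def mixture_density(y, weights, locations, sigma2):
    """Density at y of sum_l w_l N(y | theta_lt, sigma2) for one time point."""
    norm = math.sqrt(2.0 * math.pi * sigma2)
    return sum(w * math.exp(-0.5 * (y - th) ** 2 / sigma2) / norm
               for w, th in zip(weights, locations))

rng = random.Random(0)
weights = stick_breaking(alpha=1.0, n_atoms=25, rng=rng)
paths = ar1_atom_paths(n_atoms=25, T=5, mu=-7.0, rho=0.9, U=1.0, rng=rng)
locations_t3 = [path[3] for path in paths]          # kernel locations at t = 3
density_at_mean = mixture_density(-7.0, weights, locations_t3, sigma2=4.0)
```

Because the weights are shared across time and only the kernel locations move, densities at adjacent days are similar when $\rho$ is close to one and $U$ is small.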



4. Computation 



The use of simulation algorithms to fit Bayesian models has become commonplace in the last 15
years. In particular, Markov chain Monte Carlo (MCMC) algorithms, which generate a sequence
of dependent samples from the posterior distribution of the parameters conditional on the data, are
especially widespread. Given a starting guess for the value of the model parameters, these algorithms
proceed iteratively by sampling from the full conditional distribution of blocks of parameters given
all others in the model.

In this section we present an MCMC sampling scheme that exploits the Polya urn representation in
(4) and the reformulation of the model as a mixture of linear filters. In the sequel, let $L$ be the current
number of components in the mixture (5) that have observations allocated to them, $n^*_{lt}$ be the number
of observations at time $t$ assigned to group $l$, $n^*_l = \sum_t n^*_{lt}$, and let
$\Theta = \{\Theta^*_1, \ldots, \Theta^*_L\}$ and $\Sigma = \{\sigma_0^2, \ldots, \sigma_T^2\}$
be the current estimated values for those paths and time-varying variances. Also, $\xi_{it} = l$ iff
$\theta_{it} = \theta^*_{lt}$, and take negative superscripts to represent the corresponding vector excluding
the relevant variables. Given values for the structural parameters $F_{it}$, $G_{lt}$ and $W_{lt}$ and after
initialization of the parameters, an MCMC sampler alternates through the following steps:

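(1) For every $l = 1, \ldots, L$, generate $\Theta^*_l | \cdots$ using the following FFBS algorithm
(a) Forward filter using the following recursions

$m_{lt} = \begin{cases} a_{lt} + A_{lt} e_{lt} & \text{if } n^*_{lt} > 0 \\ a_{lt} & \text{if } n^*_{lt} = 0 \end{cases}$

$C_{lt} = \begin{cases} R_{lt} - A_{lt} Q_{lt} A'_{lt} & \text{if } n^*_{lt} > 0 \\ R_{lt} & \text{if } n^*_{lt} = 0 \end{cases}$

$A_{lt} = R_{lt} F_{lt} Q_{lt}^{-1}$

$e_{lt} = y_{lt} - f_{lt}$

$f_{lt} = F'_{lt} a_{lt}$

$Q_{lt} = F'_{lt} R_{lt} F_{lt} + \sigma_t^2 I$

$a_{lt} = G_{lt} m_{l,t-1}$

$R_{lt} = G_{lt} C_{l,t-1} G'_{lt} + W_{lt}$

where $y_{lt}$ is made of all observations assigned to group $l$ at time $t$ and $F_{lt}$ is a matrix
whose columns are the corresponding $F_{it}$ vectors.
(b) Sample $\theta^*_{lT} | \cdots$ from $N(m_{lT}, C_{lT})$. Then recursively sample
$\theta^*_{lt} | \theta^*_{l,t+1}, \cdots$ from $N(d_{lt}, D_{lt})$ where

$d_{lt} = m_{lt} + B_{lt} (\theta^*_{l,t+1} - a_{l,t+1})$

$D_{lt} = C_{lt} - B_{lt} R_{l,t+1} B'_{lt}$

$B_{lt} = C_{lt} G'_{l,t+1} R_{l,t+1}^{-1}$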

(2) Generate the sequence of variances $\Sigma | \Theta, \cdots$ using another FFBS algorithm
(a) Forward filtering using the following recursions

$s_t = \delta s_{t-1} + n_t$

St = ^t-i + E^i(Ait-F it e| <t>t ) 2 

Ss t -i+n t 
(b) Backward sample, starting with 




~2~' ~~ : ; 



y 21 ) and then letting 




1 



for all < t < T where 



Vt-i 



~ G 



( 



(l-S)s t -i 



2 ' 2 



(3) Sample ^1^-, • • • from a multinomial distribution with probabilities: 

Qi = n *rp( A it\y~,&j i = i,...,l~) 

= npN(A it |h, t ,H Jt ) 
q L+ i = ap(A it \S ) 

= «N (A it |h t0 ,H t0 ) 

As before, h iT = m iT , H/ T = Cit and 

h H = Bu (hj,t+i - sk+i) 
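
The forward filtering, backward sampling recursion for the kernel locations in step (1) can be sketched for the scalar case used in the application, where every $F_{it} = 1$. This is an illustrative sketch rather than the sampler used for the results: the function name and arguments are hypothetical, the evolution coefficient `g` and variance `w` are taken as constant in time, and the initial state $\theta_0$ is filtered but not returned.

```python
import math
import random

def ffbs_scalar(obs, sigma2, g, w, m0, c0, rng):
    """Draw one path theta_1, ..., theta_T for a scalar linear filter with F = 1.

    obs[t] is the (possibly empty) list of observations allocated to this
    component at time t + 1; sigma2[t] is the observation variance at that time.
    """
    a, r, m, c = [], [], [], []
    m_prev, c_prev = m0, c0
    for y, s2 in zip(obs, sigma2):
        a_t = g * m_prev                  # prior mean a_t = G m_{t-1}
        r_t = g * g * c_prev + w          # prior variance R_t = G C_{t-1} G' + W
        if y:
            # Conjugate update with n observations sharing variance s2; this is
            # the scalar form of m = a + A e and C = R - A Q A'.
            precision = 1.0 / r_t + len(y) / s2
            c_t = 1.0 / precision
            m_t = c_t * (a_t / r_t + sum(y) / s2)
        else:
            m_t, c_t = a_t, r_t           # no data: posterior equals prior
        a.append(a_t)
        r.append(r_t)
        m.append(m_t)
        c.append(c_t)
        m_prev, c_prev = m_t, c_t

    T = len(obs)
    theta = [0.0] * T
    theta[-1] = rng.gauss(m[-1], math.sqrt(c[-1]))
    for t in range(T - 2, -1, -1):
        b = c[t] * g / r[t + 1]                          # B_t = C_t G' / R_{t+1}
        d_mean = m[t] + b * (theta[t + 1] - a[t + 1])
        d_var = max(c[t] - b * b * r[t + 1], 0.0)        # guard against round-off
        theta[t] = rng.gauss(d_mean, math.sqrt(d_var))
    return theta

rng = random.Random(1)
observations = [[1.0], [2.0, 2.0], [], [4.0]]   # third day has no trades
path = ffbs_scalar(observations, [1e-8] * 4, g=1.0, w=1.0, m0=0.0, c0=10.0, rng=rng)
```

Days without observations are handled by carrying the prior moments forward, so the sampled path simply interpolates between neighboring days.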



4.1. Dealing with unknown hyperparameters. In our application, the value of the evolution
hyperparameters $\mu$, $\rho$ and $U$, as well as the discount factor $\delta$, are not known a priori and need to be
estimated from the data. MCMC methods in hierarchical Bayes models can easily deal with this type
of problem, where the parameters defining the statistical model for the observables of interest are
unknown; all that is needed is again the full conditional distribution of the unknown hyperparameters
given all other parameters.



i,t+i — 

In the specific setting of the DPM models, sampling is simplified by noting that the realizations
$\Theta^*_1, \ldots, \Theta^*_L$ are independent and identically distributed samples from the baseline measure defined
by the evolution equations in (6). For example, the full conditional distributions for $\mu$ and $U$ are



/'I 



• • • r\j 



N 



LT + 



U 



-1 



U\ IG 



' L T 

EE 

.1=1 t=i 



Ou - p6jt-i 
l-p 



LT 

IT 



-i -V 



+ LT b u + Ef=i Eli IK - A*) - pK-i ~ M 



A similar argument can be used to construct a sampler for the correlation parameter $\rho$ and the
discount factor $\delta$. In this case, a conditionally conjugate distribution is not available and we use a fine
discrete grid to simplify computation while maintaining flexibility.



4.2. Smoothing and predicting density estimates. The original goal of our analysis is to obtain
density estimates that borrow information across different periods and predict the shape of the density
in the future. Given $D_T$, which stands for all the information up to time $T$, and the variance $\sigma_t^2$, the
optimal estimator for the density at time $t < T$ under squared error loss corresponds to the posterior
predictive distribution,



(7) 



E 



n{-\T' t O u al)K t {dO t 



D 



T 



H(-\F' t t ,a?)E [K t (dO t )\D T ] 



We call this a filtered density estimate; it represents the best estimate available using all information
in the sample, both past and future. In the specific case of the nonparametric DLM models discussed
above, equation (7) can be used to derive the density estimates

Given a sample from the posterior distribution of the parameters in the model (say, of size $R$), the
integral in (8) can be easily evaluated for any value of $y$ using Monte Carlo integration as

(9) 

R 

h t (y\D T ) « 

r=l 

where the $r$ superscript denotes the $r$-th sample for the corresponding parameter, $r = 1, \ldots, R$. The
$k$-step-ahead density predictions $h_{t+k}(\cdot | D_t)$, corresponding to the best density estimate obtained only
from past information, can be obtained in a similar way.
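
The Monte Carlo average over posterior draws can be sketched as follows. The sketch assumes, as one plausible form under a marginal sampler, that each retained draw contributes the Polya urn predictive: occupied components weighted by $n^*_l / (\alpha + n)$ plus a baseline term weighted by $\alpha / (\alpha + n)$. The dictionary layout of a draw and the `base_density` argument (the density of a new observation when its kernel location is drawn from the baseline measure) are hypothetical names introduced for the example.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def density_estimate(y, draws, base_density):
    """Average, over posterior draws, of the predictive density at y for one day.

    Each draw is a dict with keys "counts" (n_l per occupied component),
    "locations" (kernel locations at the day of interest), "sigma2" and "alpha".
    """
    total = 0.0
    for draw in draws:
        n = sum(draw["counts"])
        alpha = draw["alpha"]
        occupied = sum(n_l * normal_pdf(y, th, draw["sigma2"])
                       for n_l, th in zip(draw["counts"], draw["locations"]))
        total += (occupied + alpha * base_density(y)) / (alpha + n)
    return total / len(draws)

# Hypothetical posterior draws for a single day.
draws = [
    {"counts": [3, 1], "locations": [-6.0, -20.0], "sigma2": 4.0, "alpha": 1.0},
    {"counts": [4], "locations": [-7.5], "sigma2": 5.0, "alpha": 0.5},
]
estimate = density_estimate(-7.0, draws, base_density=lambda y: normal_pdf(y, -10.0, 100.0))
```

Evaluating `density_estimate` on a grid of `y` values gives the curve that would be plotted for that day.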



z — ' « + a + -L\ / 

2=1 



5. Illustration: implied market expectation prices for the S&P500 index 

In this Section we apply the model from Section 3 to the data introduced in Section 2. We assume
$m_0 \sim N(\eta, \kappa^2)$ where $\eta = -10.0$ and $\kappa^2 = 100.0$. This choice reflects approximately the location and
dispersion of the data. However, results were similar under our sensitivity analysis, which included
values of $\eta$ between -30.0 and 10.0 and values of $\kappa^2$ between 25 and 400. Prior parameters for $\sigma_0^2$ were
chosen as $s_0 = 1.0$ and $S_0 = 10.0$, while $U$ and $\alpha$ were assigned priors IG(2.0, 1.0) and G(1.0, 1.0)
respectively. Finally, for the global mean we set $\mu \sim N(\mu_0, \tau^2)$, with $\mu_0 = 0$ and $\tau^2 = 25$. Again,
results were robust to moderate changes in these prior parameters.

A variant of the MCMC sampler described in Section 4 was used to fit this model. All results
are based on 20,000 iterations obtained after a burn-in period of 5,000 samples. No convergence
problems were evident from inspection of trace plots. Formal assessment of convergence was done
using the Gelman-Rubin test (Gelman and Rubin, 1992), which compares the variability within and
between multiple runs of the sampler with overdispersed starting values.
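
For reference, the within versus between comparison behind this diagnostic can be sketched as below. This is the textbook form of the potential scale reduction factor for a scalar quantity, assuming several chains of equal length; it is not necessarily the exact variant used for the results reported here, and the chains in the usage lines are made up.

```python
import math
import statistics

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of a scalar."""
    n = len(chains[0])
    # W: average of the within-chain sample variances.
    within = statistics.mean(statistics.variance(chain) for chain in chains)
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance([statistics.mean(chain) for chain in chains])
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

# Two hypothetical chains that disagree about the location of the target.
r_hat_bad = gelman_rubin([[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]])
# Two hypothetical chains that agree.
r_hat_good = gelman_rubin([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
```

Values well above one indicate that the chains have not yet mixed over the same region.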

Figure 4 shows density estimates generated by the model for the trading days between February
18 and February 25 of 1993. These include the dates for the simple estimates in Figure 3, along
with two additional dates (February 19 and 22) for which data was unavailable. The plots also show
the original observations in order to demonstrate the plausibility of the estimates. First, we note that
the two sets of estimates are dramatically different. This is not really surprising; kernel density estimates are well
known to be unreliable for very small sample sizes like ours, and standard methods do not borrow
information across time, favoring enormous differences in estimates across consecutive time points.
Second, we note that, in spite of the differences and in agreement with the descriptive analysis in
Section 2, density estimates show negative skewness, with a very heavy left tail. This indicates that a
relatively large number of actors in the option market tend to seriously undervalue the assets. Unlike
the volatility smile observed in studies of implied volatility, this conclusion is not an artifact of the
statistical model but an actual feature of the behavior of market actors.

Figure 4. Density estimates for $\Delta_t$ (the difference between spot and implied prices)
for all trading days between February 18 and February 25, 1993. Distributions show
negative skewness and, in some cases, multimodality.

Figure 5 displays the sequences of means and medians for the densities estimated by the model.
Note that 1) both tend to be slightly negative, oscillating between 0 and -10 index points (which is
expected as there are transaction costs associated with the synthetic portfolio), and 2) the means tend
to be slightly smaller than the corresponding medians (which indicates again that the distributions
of differences between market and implied prices are left skewed). Another striking feature is the
sharp contrast with Figure 2: our location estimates do not present the wild swings observed in
simple averages or standard parametric (Gaussian) models. Therefore, the model indicates that, for the
S&P500, the private valuation of the asset by the "average" investor roughly agrees with the
valuation in the spot market. However, this does not mean that all investors agree with these spot
prices. Figure 6 shows estimates of volatility for the distributions of differences; high levels of
volatility are associated with disagreement across actors and vice versa. Since the distributions are
skewed and possibly multimodal, we use the interquartile range rather than the variance to measure
volatility. In this setup, the interquartile range can be interpreted as the difference between the
valuations of the 25% most bullish and the 25% most bearish investors. The interquartile range tends
to be small (below 5 S&P500 points), showing that most investors tend to agree most of the time.
However, the plot reveals one period of very high volatility in the late summer of 1993, along with
three periods of moderate volatility around January 1993, April 1993 and March 1994. This information
is particularly interesting as large discrepancies between the beliefs of market actors are thought to
influence traded volume and returns. Indeed, a simple regression of these estimated volatilities
against next-day trading volumes yields a relatively high correlation, around 0.65.

Figure 5. Estimates for the mean and median of the nonparametric density estimates.
These can be interpreted as the expectations of the mean and median investor. Original
observations are presented as reference.
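The location and spread summaries discussed above (mean, median and interquartile range) can be read off any density estimate that has been evaluated on a grid. The sketch below is only an illustration: the two-component normal mixture standing in for a left-skewed density, the grid limits, and the helper names `normal_pdf` and `density_summaries` are all assumptions, not quantities estimated in this paper.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def density_summaries(grid, dens):
    """Mean, median and interquartile range of a density tabulated on a grid."""
    steps = np.diff(grid)
    # Probability mass of each grid cell by the trapezoid rule.
    cell_mass = 0.5 * (dens[1:] + dens[:-1]) * steps
    cum = np.cumsum(cell_mass)
    total = cum[-1]
    # CDF at the grid points, renormalised so that it ends at 1.
    cdf = np.concatenate([[0.0], cum]) / total
    # Quantiles: invert the CDF by linear interpolation.
    q25, q50, q75 = np.interp([0.25, 0.50, 0.75], cdf, grid)
    # Mean: trapezoid rule applied to x * density.
    xd = grid * dens
    mean = (0.5 * (xd[1:] + xd[:-1]) * steps).sum() / total
    return mean, q50, q75 - q25

# Hypothetical left-skewed density for the difference (in index points):
# a tight component near -2 plus a diffuse component centred at -15.
grid = np.linspace(-60.0, 30.0, 9001)
dens = 0.8 * normal_pdf(grid, -2.0, 2.0) + 0.2 * normal_pdf(grid, -15.0, 8.0)

mean, median, iqr = density_summaries(grid, dens)
print(f"mean={mean:.2f}, median={median:.2f}, IQR={iqr:.2f}")
```

In this toy example the heavy left tail pulls the mean below the median, which is the same signature of negative skewness noted for the estimated densities. The interquartile range, unlike the variance, is barely affected by how far that tail extends.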

Figure 6. Interquartile range in the estimated densities. Large volatilities correspond
to strong discrepancies across market actors with respect to the "fair" value of the
asset.

Finally, another interesting measure of market expectations arising from our nonparametric density
estimates is the proportion of market actors with extreme under/over valuation. As an illustration, we
show in Figure 7 the probability that implicit valuations fall at least 15 points below the spot price.
Since it is reasonable to assume that transaction costs for a very liquid market like the S&P500 are
typically well below 3% of the price of the asset, this can be interpreted as the proportion of market
actors that believe that the spot market is at least mildly overvalued. The results are quite
interesting: although the proportion of investors that consider the stock market overvalued is between
5% and 10% most of the time, it can jump to 25% in periods of serious disagreement. In contrast, a
similar calculation on the upper tail (not shown) reveals that the proportion of market actors who
think that the market is undervalued is consistently smaller, even in periods of high volatility.

Figure 7. Probability that implicit valuations fall at least 15 points below the spot
price. This can be interpreted as the proportion of market actors that believe that the
spot market is overvalued.
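A tail probability of this kind has a closed form whenever the density estimate is written as a finite mixture of normals. The sketch below assumes such a representation purely for illustration; the weights, means and standard deviations are hypothetical, as is the helper name `lower_tail_prob`.

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of N(mean, sd^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def lower_tail_prob(threshold, weights, means, sds):
    """P(difference <= threshold) under a finite mixture of normals."""
    return sum(w * normal_cdf(threshold, m, s)
               for w, m, s in zip(weights, means, sds))

# Hypothetical mixture for the difference between implied and spot prices.
weights = [0.8, 0.2]
means = [-2.0, -15.0]
sds = [2.0, 8.0]

# Share of valuations at least 15 points below the spot price.
p_low = lower_tail_prob(-15.0, weights, means, sds)
# Share of valuations at least 15 points above the spot price.
p_high = 1.0 - lower_tail_prob(15.0, weights, means, sds)
print(f"P(diff <= -15) = {p_low:.4f}")
print(f"P(diff >= +15) = {p_high:.6f}")
```

With a left-skewed mixture like this one, the lower-tail share is far larger than the upper-tail share, which mirrors the asymmetry described in the text.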



6. Discussion 

We have demonstrated how information about asset prices contained in option markets can be
recovered using nonparametric Bayesian methods. The model we propose is flexible enough to capture
the characteristics of distributions of implied prices, which include skewness and multimodality, and
does not rely on any of the parametric assumptions underlying the Black & Scholes valuation formula.
The methodology provides dynamic estimates of the densities and, from them, we are able to derive
different measures of expectations, including the behavior of the "mean" investor, the level of
agreement across investors and the proportion of investors that think that the market is
under/overpriced.

We illustrated our methodology using data from the S&P500 option market. In this case, the model
indicates that the "average" investor in the option market has private valuations for the asset that are
very close to the spot prices, with the consistently observed differences probably corresponding to
transaction costs. This is not terribly surprising, as both the spot and options markets for the S&P500
are extremely liquid. We expect that our methods can provide different results in less liquid
markets, where large differences between spot and implied prices can arise due to small sample sizes
and information asymmetry.

The model also shows that, although most market actors tend to agree with the average investor
on the price of the S&P500 index, valuations can differ substantially across investors, especially in
periods of low market volumes. This points again to the utility of these methods in illiquid markets,
where they can reveal mispricing issues. In addition, we have demonstrated that these periods of
high uncertainty and disagreement among investors tend to be periods where a large proportion of
actors consider that the index (and by extension, the stock market as a whole) is overvalued.
Surprisingly, over this period investors rarely seem to think that the stock market is undervalued.
This can be partially explained by the widespread practice of using the derivative market for hedging;
bullish transactions in the spot market can be offset by bearish transactions in the option market.
However, given the enormous size of derivative markets, we think this cannot be the full story and that
underpricing indeed reveals private information.

Furthermore, the recent emergence of a new methodological framework for assessing sovereign credit
risk (Gray et al., 2007; Gray and Malone, 2008) has opened new doors for policy makers not only to
manage sovereign credit risk, but also to manage the risk of the sectors composing the economy of a
country (households, financial and public). In this context, it is possible to extract from the balance
sheets of the different sectors of an economy the observations for the implied asset level (the
underlying) of each sector (Gray et al., 2007; Gray and Malone, 2008) over a given time period. This
could allow us to study the evolution of the time-varying distribution of the asset level for a
particular sector with the model developed in this manuscript, and thus to better understand the risk
characteristics of the sector's assets. This will be the subject of future work.

Our description of the model ignores bid-ask spreads, whose presence yields intervals of rational 
implied prices rather than a single price. However, in our example, the spreads are so small that 
they can be safely overlooked and the interval replaced by its midpoint. In markets where spreads 
are noticeable, the model can be easily extended by imputing the true prices as part of the MCMC 
sampling scheme. Details will be provided elsewhere. 
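To make the idea of imputing prices inside the bid-ask interval concrete, the sketch below shows one possible imputation step. It is not the authors' scheme, whose details are deferred. It assumes that, given the current state of the sampler, the latent price has a normal conditional distribution, and it draws from that distribution truncated to the quoted interval. The function name `impute_price` and all numerical values are hypothetical.

```python
import random
from statistics import NormalDist

def impute_price(bid, ask, cond_mean, cond_sd, rng):
    """Draw a latent price from N(cond_mean, cond_sd^2) truncated to [bid, ask].

    Uses the inverse-CDF method: sample a uniform value between the CDF
    at the bid and the CDF at the ask, then map it back through the
    normal quantile function.
    """
    dist = NormalDist(cond_mean, cond_sd)
    lo, hi = dist.cdf(bid), dist.cdf(ask)
    u = lo + (hi - lo) * rng.random()
    # Keep u strictly inside (0, 1), as required by inv_cdf.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    # Guard against rounding pushing the draw just outside the interval.
    return min(max(dist.inv_cdf(u), bid), ask)

rng = random.Random(1)
# Hypothetical quote and conditional distribution (in index points).
draws = [impute_price(449.50, 450.00, 449.90, 0.50, rng) for _ in range(1000)]
print(min(draws), max(draws))
```

Within an MCMC scheme, a step of this type would be repeated at every iteration for each quoted interval, with the conditional mean and standard deviation taken from the current values of the model parameters.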

We have already argued that our methods provide estimates for the distribution of current prices
$S_t$, and not for the prices at expiration, $S_T$. Therefore, the implied-price distributions we show
in this paper do not correspond to implied risk-neutral distributions for the S&P500. However, if the
process driving the implied prices is assumed to be a martingale (Delbaen and Schachermayer, 2006), then
the expectation of the current price distribution is equal to the expectation of the prices at
expiration, and implied-price distributions can be used to provide information about the risk-neutral
distribution.
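The martingale step can be written out in one line. The filtration symbol $\mathcal{F}_t$ (the information available at time $t$) is notation introduced here for illustration and is not defined elsewhere in the paper.

```latex
\mathbb{E}\left[ S_T \mid \mathcal{F}_t \right] = S_t \quad (t \le T)
\qquad \Longrightarrow \qquad
\mathbb{E}\left[ S_T \right]
  = \mathbb{E}\left[ \mathbb{E}\left[ S_T \mid \mathcal{F}_t \right] \right]
  = \mathbb{E}\left[ S_t \right].
```

The equality concerns expectations only; higher moments and the shape of the distribution of $S_T$ are not pinned down by this argument.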